(54) Client-server based speech recognition

(57) A client sends to a server, via a communication module, a user
dictionary formed by storing pronunciations and notations of target
recognition words designated by the user in correspondence with each
other, input speech recognition data, and dictionary management
information used to determine the recognition field of a recognition
dictionary used in recognition of the speech recognition data. In the
server, a dictionary management unit looks up an identifier table to
determine, from a plurality of kinds of recognition dictionaries, a
recognition dictionary corresponding to the dictionary management
information received from the client. A speech recognition module
recognizes the speech recognition data using at least the determined
recognition dictionary. The recognition result is sent to the client
via a communication module.
Description

FIELD OF THE INVENTION 

[0001] The present invention relates to a client-server
speech recognition system for recognizing speech input 
at a client by a server, a speech recognition server, a 
speech recognition client, their control method, and a 
computer readable memory. 

BACKGROUND OF THE INVENTION 

[0002] In recent years, speech has come into use as an input
interface in addition to a keyboard, mouse, and the like.
[0003] However, as the number of recognition words which are to
undergo speech recognition becomes larger, the recognition rate of
speech recognition lowers and the processing time becomes longer. For
this reason, in practice, a plurality of recognition dictionaries or
lexicons that register recognition words (e.g., pronunciations and
notations) which are to undergo speech recognition are prepared, and
are selectively used (a plurality of recognition dictionaries may be
used at the same time).
[0004] Also, unregistered words cannot be recognized. As one method
for solving this problem, a user dictionary or lexicon (prepared by
the user to register recognition words which are to undergo speech
recognition) may be used.

[0005] On the other hand, a client-server speech rec- 
ognition system has been studied to implement speech 
recognition on a terminal with insufficient resources. 
[0006] These three techniques are known to those 
who are skilled in the art, but a system that combines 
these three techniques has not been realized yet. 

SUMMARY OF THE INVENTION 

[0007] The present invention has been made to solve 
the above problems, and has as its object to provide a 
speech recognition system which uses a user dictionary 
in response to a user's request in a client-server speech 
recognition system so as to improve speech input effi- 
ciency and to reduce the processing load on the entire 
system, a speech recognition server, a speech recogni- 
tion client, their control method, and a computer reada- 
ble memory. 

[0008] According to the present invention, the foregoing object is
attained by providing a client-server speech recognition system for
recognizing speech input at a client by a server,

the client comprising:

speech input means for inputting speech;
user dictionary holding means for holding a user dictionary formed by
registering target recognition words designated by a user; and
transmission means for transmitting speech data input by said speech
input means, dictionary management information used to determine a
recognition field of a recognition dictionary used to recognize the
speech data, and the user dictionary to the server, and

the server comprising:

recognition dictionary holding means for holding a plurality of kinds
of recognition dictionaries prepared for respective recognition
fields;
determination means for determining one or more recognition
dictionaries corresponding to the dictionary management information
received from the client from the plurality of kinds of recognition
dictionaries; and
recognition means for recognizing the speech data using at least the
recognition dictionary determined by said determination means.

[0009] Other features and advantages of the present
invention will be apparent from the following description
taken in conjunction with the accompanying drawings,
in which like reference characters designate the same
or similar parts throughout the figures thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010]

Fig. 1 is a block diagram showing the hardware arrangement of a
speech recognition system of the first embodiment;
Fig. 2 is a block diagram showing the functional arrangement of the
speech recognition system of the first embodiment;
Fig. 3 shows the configuration of a user dictionary of the first
embodiment;
Fig. 4 shows a speech input window of the first embodiment;
Fig. 5 shows an identifier table of the first embodiment;
Fig. 6 is a flow chart showing the process executed by the speech
recognition system of the first embodiment;
Fig. 7 shows the configuration of a user dictionary appended with
input form identifiers according to the third embodiment; and
Fig. 8 shows the configuration of a user dictionary appended with
recognition dictionary identifiers according to the third embodiment.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0011] Preferred embodiments of the present invention will be
described in detail below with reference to the accompanying drawings.

[First Embodiment] 

[0012] Fig. 1 shows the hardware arrangement of a 
speech recognition system of the first embodiment. 
[0013] A CPU 101 systematically controls an entire client 100. The
CPU 101 loads programs stored in a ROM 102 onto a RAM 103, and
executes various processes on the basis of the loaded programs. The
ROM 102 stores various programs of processes to be executed by the
CPU 101. The RAM 103 provides a storage area required to execute
various programs stored in the ROM 102.

[0014] A secondary storage device 104 stores an OS and various
programs. When the client 100 is implemented using a dedicated
apparatus rather than a general-purpose apparatus such as a personal
computer, the ROM 102 may store the OS and various programs. By
loading the stored programs onto the RAM 103, the CPU 101 can execute
processes. As the secondary storage device 104, a hard disk device,
floppy disk drive, CD-ROM, or the like may be used. That is, storage
media are not particularly limited.
[0015] A network I/F (interface) 105 is connected to a network I/F 205
of a server 200.

[0016] An input device 106 comprises a mouse, keyboard, microphone,
and the like to allow input of various instructions to processes to
be executed by the CPU 101; a plurality of these devices can be
connected and used simultaneously. An output device 107 comprises a
display (CRT, LCD, or the like), and displays information input by
the input device 106 and display windows which are controlled by
various processes executed by the CPU 101. A bus 108 interconnects
various building components of the client 100.
[0017] A CPU 201 systematically controls the entire 
server 200. The CPU 201 loads programs stored in a 
ROM 202 onto a RAM 203, and executes various proc- 
esses on the basis of the loaded programs. The ROM 
202 stores various programs of processes to be execut- 
ed by the CPU 201. The RAM 203 provides a storage 
area required to execute various programs stored in the 
ROM 202. 

[0018] A secondary storage device 204 stores an OS and various
programs. When the server 200 is implemented using a dedicated
apparatus rather than a general-purpose apparatus such as a personal
computer, the ROM 202 may store the OS and various programs. By
loading the stored programs onto the RAM 203, the CPU 201 can execute
processes. As the secondary storage device 204, a hard disk device,
floppy disk drive, CD-ROM, or the like may be used. That is, storage
media are not particularly limited.

[0019] The network I/F 205 is connected to the network I/F 105 of the
client 100. A bus 206 interconnects various building components of
the server 200.
[0020] The functional arrangement of the speech recognition system of
the first embodiment will be described below using Fig. 2.

[0021] Fig. 2 is a block diagram showing the functional arrangement
of the speech recognition system of the first embodiment.

[0022] In the client 100, a speech input module 121 inputs speech
uttered by the user via a microphone (input device 106), and
A/D-converts input speech data (speech recognition data) which is to
undergo speech recognition. A communication module 122 sends a user
dictionary 124a, speech recognition data 124b, dictionary management
information 124c, and the like to the server 200. Also, the
communication module 122 receives a speech recognition result of the
sent speech recognition data 124b and the like from the server 200.
[0023] A display module 123 displays the speech recognition result
received from the server 200 while storing it in, e.g., an input form
which is displayed on the output device 107 by the process executed
by the speech recognition system of this embodiment.

[0024] In the server 200, a communication module 221 receives the
user dictionary 124a, speech recognition data 124b, dictionary
management information 124c, and the like from the client 100. Also,
the communication module 221 sends the speech recognition result of
the speech recognition data 124b and the like to the client 100.

[0025] A dictionary management module 223 switches and selects a
plurality of kinds of recognition dictionaries 225 (recognition
dictionary 1 to recognition dictionary N, N: a positive integer)
prepared for respective recognition fields (e.g., for names,
addresses, alphanumeric symbols, and the like), and the user
dictionary 124a received from the client 100 (a plurality of kinds of
dictionaries may be used simultaneously).

[0026] Note that the plurality of kinds of recognition dictionaries
225 are prepared for each dictionary management information 124c
(input form identifier; to be described later) sent from the client
100. Each recognition dictionary 225 is appended with a recognition
dictionary identifier indicating the recognition field of that
recognition dictionary. The dictionary management module 223 manages
an identifier table 223a that stores the recognition dictionary
identifiers and input form identifiers in correspondence with each
other, as shown in Fig. 5.

[0027] A speech recognition module 224 executes speech recognition
using the recognition dictionary or dictionaries 225 and user
dictionary 124a designated for speech recognition by the dictionary
management module 223 on the basis of the speech recognition data
124b and dictionary management information 124c received from the
client 100.

[0028] Note that the user dictionary 124a is prepared by the user to
register recognition words which are to undergo speech recognition,
and stores pronunciations and notations of words to be recognized in
correspondence with each other, as shown in, e.g., Fig. 3.
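
As an illustration only, a user dictionary that stores pronunciations and notations in correspondence with each other might be modelled as below. The contents of Fig. 3 are not reproduced in this text, so the class names `UserWord` and `UserDictionary`, the method names, and the sample entry are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserWord:
    """One registered word (hypothetical layout; Fig. 3 is not reproduced here)."""
    pronunciation: str  # how the word is spoken
    notation: str       # how the word is written in the recognition result


class UserDictionary:
    """Holds user-registered words, indexed by pronunciation."""

    def __init__(self):
        self._by_pronunciation = {}

    def register(self, pronunciation, notation):
        # Several notations may share one pronunciation, so keep a list.
        word = UserWord(pronunciation, notation)
        self._by_pronunciation.setdefault(pronunciation, []).append(word)

    def notations_for(self, pronunciation):
        # Return every notation registered for this pronunciation.
        return [w.notation for w in self._by_pronunciation.get(pronunciation, [])]

    def words(self):
        # Flat list of all registered words, e.g. for sending to a server.
        return [w for ws in self._by_pronunciation.values() for w in ws]
```

A client under these assumptions would call `register()` whenever the user designates a new word and pass `words()` to its transmission routine.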
[0029] The speech recognition data 124b may be either speech data
A/D-converted by the speech input module 121 or data obtained by
encoding that speech data.

[0030] The dictionary management information 124c indicates an input
object and the like. For example, when the server 200 recognizes
input speech and text data corresponding to the speech recognition
result is input to each of the input forms that make up a speech
input window displayed by the speech recognition system of the first
embodiment, as shown in Fig. 4, the dictionary management information
124c is an identifier (input form identifier) indicating the type of
input form. The client 100 sends this input form identifier to the
server 200 as the dictionary management information 124c. In the
server 200, the dictionary management module 223 looks up the
identifier table 223a to acquire a recognition dictionary identifier
corresponding to the received input form identifier, and determines a
recognition dictionary 225 to be used in speech recognition.
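
The lookup performed by the dictionary management module can be sketched as a plain mapping from input form identifiers to recognition dictionary identifiers. The contents of Fig. 5 are not available in this text, so the identifier strings and the function name `determine_dictionaries` below are assumptions made purely for illustration.

```python
# Hypothetical identifier table: input form identifier -> recognition
# dictionary identifiers. The real entries of Fig. 5 are not known here.
IDENTIFIER_TABLE = {
    "name_form": ["names"],
    "address_form": ["addresses"],
    "code_form": ["alphanumeric"],
    "contact_form": ["names", "addresses"],  # several dictionaries at once
}


def determine_dictionaries(input_form_id, table=IDENTIFIER_TABLE):
    """Return the recognition dictionary identifiers for an input form."""
    try:
        # Copy the list so callers cannot mutate the table by accident.
        return list(table[input_form_id])
    except KeyError:
        raise ValueError(f"unknown input form identifier: {input_form_id!r}")
```

Returning a list rather than a single identifier reflects the statement that one or more recognition dictionaries may be used at the same time.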

[0031] The process executed by the speech recogni- 
tion system of the first embodiment will be explained be- 
low using Fig. 6. 

[0032] Fig. 6 is a flow chart showing the process ex- 
ecuted by the speech recognition system of the first em- 
bodiment. 

[0033] In step S101, the client 100 sends the user dictionary 124a to
the server 200.

[0034] In step S201, the server 200 receives the user dictionary 124a
from the client 100.
[0035] In step S102, when speech is input to an input form as a
target speech input, the client 100 sends the input form identifier
of that input form to the server 200 as the dictionary management
information 124c.
[0036] In step S202, the server 200 receives the input form
identifier from the client 100 as the dictionary management
information 124c.

[0037] In step S203, the server 200 looks up the iden- 
tifier table 223a using the dictionary management infor- 
mation 124c to acquire a recognition dictionary identifier 
corresponding to the received input form identifier, and 
determines a recognition dictionary 225 to be used in 
speech recognition. 

[0038] In step S103, the client 100 sends to the server 200 the
speech recognition data 124b, i.e., the speech that was input in
order to enter text data in each input form.
[0039] In step S204, the server 200 receives the speech recognition
data corresponding to each input form from the client 100.

[0040] In step S205, the server 200 executes speech 
recognition of the speech recognition data 124b in the 
speech recognition module 224 using the recognition 
dictionary 225 and user dictionary 124a designated for 
speech recognition by the dictionary management mod- 
ule 223. 

[0041] In the first embodiment, all recognition words contained in
the user dictionary 124a sent from the client 100 to the server 200
are used in speech recognition by the speech recognition module 224.
[0042] In step S206, the server 200 sends the speech recognition
result obtained by the speech recognition module 224 to the client
100.

[0043] In step S104, the client 100 receives the speech recognition
result corresponding to each input form from the server 200, and
stores text data corresponding to the speech recognition result in
the corresponding input form.

[0044] The client 100 checks in step S105 if the processing is to be
ended. If the processing is not to be ended (NO in step S105), the
flow returns to step S102 to repeat the processing. On the other
hand, if the processing is to be ended (YES in step S105), the client
100 informs the server 200 of the end of the processing, and ends the
processing.

[0045] It is checked in step S207 if a processing end instruction
from the client 100 is detected. If no processing end instruction is
detected (NO in step S207), the flow returns to step S202 to repeat
the above processes. On the other hand, if the processing end
instruction is detected (YES in step S207), the processing ends.
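
The server-side steps S201 to S207 can be summarized with the following sketch. It is not the implementation of the embodiment: the message tuples, the function names, and the `toy_recognize` stand-in (which treats the speech data as an already decoded pronunciation string) are assumptions, and no real speech recognizer or network transport is involved.

```python
def run_server_session(messages, identifier_table, recognition_dictionaries, recognize):
    """Process client messages in order and return the recognition results."""
    user_dictionary = {}
    active = []   # recognition dictionaries chosen for the current input form
    results = []
    for kind, payload in messages:
        if kind == "user_dictionary":        # S201: receive user dictionary
            user_dictionary = dict(payload)
        elif kind == "input_form":           # S202, S203: look up identifier table
            active = [recognition_dictionaries[i] for i in identifier_table[payload]]
        elif kind == "speech":               # S204 to S206: recognize and reply
            vocabulary = {}
            for dictionary in active:
                vocabulary.update(dictionary)
            vocabulary.update(user_dictionary)
            results.append(recognize(payload, vocabulary))
        elif kind == "end":                  # S207: end instruction detected
            break
    return results


def toy_recognize(speech_data, vocabulary):
    # Stand-in for a recognizer: maps a pronunciation string to its notation.
    return vocabulary.get(speech_data, "")
```

In this sketch each `"speech"` message is recognized against the dictionaries selected by the most recent `"input_form"` message plus the whole user dictionary, which mirrors the first embodiment.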
[0046] In the above processing, when speech is input to an input form
as a target speech input, the dictionary management information 124c
corresponding to that input form is sent from the client 100 to the
server 200. Alternatively, the dictionary management information 124c
may be sent when the input form as a target speech input is focused
by an instruction from the input device 106 (i.e., when the input
form as a target speech input is determined).

[0047] In the server 200, speech recognition is performed after all
speech recognition data 124b are received. Alternatively, every time
speech is input as text data to a given input form, that portion of
the speech recognition data 124b may be sent to the server 200 frame
by frame (for example, one frame is 10 msec of speech data), and
speech recognition may be performed in real time.
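
Frame-by-frame transmission can be illustrated by cutting the sampled speech into 10 msec pieces. The text gives only the frame duration, so the 16 kHz sampling rate and the function name `split_into_frames` are assumptions.

```python
def split_into_frames(samples, sample_rate_hz=16000, frame_ms=10):
    """Yield consecutive frames of `frame_ms` milliseconds each.

    The 16 kHz default is an assumed sampling rate; the final frame may be
    shorter than the others when the input does not divide evenly.
    """
    frame_len = sample_rate_hz * frame_ms // 1000
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    for start in range(0, len(samples), frame_len):
        yield samples[start:start + frame_len]
```

A client could send each yielded frame as soon as it is available, so that the server can start recognition before the utterance ends.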
[0048] As described above, according to the first embodiment, in the
client-server speech recognition system, since the server 200
executes speech recognition of speech recognition data 124b using
both an appropriate recognition dictionary 225 and the user
dictionary 124a, the speech recognition precision in the server 200
can be improved while reducing the processing load and use of storage
resources associated with speech recognition in the client 100.

[Second Embodiment] 

[0049] In the first embodiment, the user dictionary 124a is always
used in recognition. However, when no recognition words have been
registered in the user dictionary 124a, the user dictionary 124a need
not be used. Hence, the server 200 may use the recognition words in
the user dictionary 124a in recognition only when a request to use
the user dictionary 124a is received from the client 100.
[0050] In this case, a flag indicating whether the user dictionary
124a is used is added to the dictionary management information 124c,
thus informing the server 200 of whether or not the user dictionary
124a is to be used.
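
A minimal sketch of the second embodiment follows, assuming the dictionary management information is carried as a small record. The field names `input_form_id` and `use_user_dictionary` and the function `build_vocabulary` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryManagementInfo:
    """Hypothetical record for the dictionary management information."""
    input_form_id: str
    use_user_dictionary: bool = False  # flag added in the second embodiment


def build_vocabulary(info, recognition_dictionary, user_dictionary):
    """Merge the user dictionary into the vocabulary only when requested."""
    vocabulary = dict(recognition_dictionary)
    if info.use_user_dictionary:
        vocabulary.update(user_dictionary)
    return vocabulary
```

When the flag is false the server searches only the recognition dictionary, which keeps the vocabulary, and hence the search effort, smaller.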

[Third Embodiment] 

[0051] Since some target recognition words in the user dictionary
124a are not used depending on an input object, situation, and the
like, only specific recognition words in the user dictionary 124a may
be used in recognition depending on the input object and situation.
[0052] In such a case, when the user dictionary is managed by
designating input form identifiers for respective recognition words,
as shown in Fig. 7, only recognition words having the input form
identifier of the input form used in speech input are used in
recognition. A plurality of input form identifiers may be designated
for a given recognition word. In addition, the user dictionary may be
managed by designating recognition dictionary identifiers in place of
input form identifiers, as shown in Fig. 8.
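
The selection described for the third embodiment amounts to filtering user dictionary entries by the identifier of the active input form. The layout of Fig. 7 is not reproduced in this text, so the record `TaggedUserWord` and the function `words_for_form` are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedUserWord:
    """User dictionary entry tagged with input form identifiers (hypothetical)."""
    pronunciation: str
    notation: str
    input_form_ids: frozenset  # one word may carry several identifiers


def words_for_form(user_dictionary, input_form_id):
    """Keep only the words designated for the given input form."""
    return [w for w in user_dictionary if input_form_id in w.input_form_ids]
```

The same filter would apply to the Fig. 8 variant by storing recognition dictionary identifiers in place of input form identifiers.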



[Fourth Embodiment]

[0053] By combining the second and third embodiments, the efficiency
of the speech recognition process of the speech recognition module
224 can be further improved.

[Fifth Embodiment]

[0054] Most of the processes of the apparatus of the present
invention can be implemented by programs. As described above, since
the apparatus can use a general-purpose apparatus such as a personal
computer, the present invention is also achieved by supplying a
storage medium, which records a program code of a software program
that can implement the functions of the above-mentioned embodiments,
to a system or apparatus, and reading out and executing the program
code stored in the storage medium by a computer of the system or
apparatus. In this case, the program code itself read out from the
storage medium implements the functions of the above-mentioned
embodiments, and the storage medium which stores the program code
constitutes the present invention. As the storage medium for
supplying the program code, for example, a floppy disk, hard disk,
optical disk, magneto-optical disk, CD-ROM, magnetic tape,
nonvolatile memory card, ROM, and the like may be used.

[0055] The present invention can also be achieved by supplying the
storage medium that records the program code to a computer, and
executing some or all of actual processes executed by an OS running
on the computer. Furthermore, the functions of the above-mentioned
embodiments may be implemented by some or all of actual processing
operations executed by a CPU or the like arranged in a function
extension board or a function extension unit, which is inserted in or
connected to the computer, after the program code read out from the
storage medium is written in a memory of the extension board or unit.
When the present invention is applied to the storage medium, that
storage medium stores a program code corresponding to the flow chart
shown in Fig. 6.

[0056] As many apparently widely different embodiments of the present
invention can be made without departing from the spirit and scope
thereof, it is to be understood that the invention is not limited to
the specific embodiments thereof except as defined in the appended
claims.

Claims

1. A client-server speech recognition system for recognizing speech
input at a client by a server,

the client comprising:

speech input means for inputting speech;
user dictionary holding means for holding a user dictionary formed by
registering target recognition words designated by a user; and
transmission means for transmitting speech data input by said speech
input means, dictionary management information used to determine a
recognition field of a recognition dictionary used to recognize the
speech data, and the user dictionary to the server, and

the server comprising:

recognition dictionary holding means for holding a plurality of kinds
of recognition dictionaries prepared for respective recognition
fields;
determination means for determining one or more recognition
dictionaries corresponding to the dictionary management information
received from the client from the plurality of kinds of recognition
dictionaries; and
recognition means for recognizing the speech data using at least the
recognition dictionary determined by said determination means.

2. The system according to claim 1, wherein said recognition means
recognizes the speech data using the recognition dictionary
determined by said determination means, and the user dictionary
received from the client.

3. The system according to claim 1 or 2, wherein said speech input
means comprises display means for displaying an input form as a
target speech input, and

the dictionary management information is an input form identifier
that indicates a type of input form.

4. The system according to any of claims 1 to 3, wherein the
dictionary management information contains information indicating if
the user dictionary is used in recognition of the speech data.

5. The system according to any preceding claim, wherein the user
dictionary is formed by storing pronunciations and notations of the
target recognition words in correspondence with each other.

6. The system according to claim 3, wherein the user dictionary is
formed by also storing at least one input form identifier and the
target recognition words in correspondence with each other.

7. The system according to any preceding claim, wherein the user
dictionary is formed by also storing at least one of recognition
dictionary identifiers indicating recognition fields of the plurality
of kinds of recognition dictionaries, and the target recognition
words.

8. The system according to any preceding claim, wherein the speech
data is data obtained by encoding that speech data.

9. A method of controlling a client-server speech recognition system
for recognizing speech input at a client by a server, comprising:

a speech input step of inputting speech;
a user dictionary holding step of holding, in the client, a user
dictionary formed by registering target recognition words designated
by a user; and
a transmission step of transmitting speech data input in the speech
input step, dictionary management information used to determine a
recognition field of a recognition dictionary used to recognize the
speech data, and the user dictionary to the server;
a recognition dictionary holding step of holding, in the server, a
plurality of kinds of recognition dictionaries prepared for
respective recognition fields;
a determination step of determining one or more recognition
dictionaries corresponding to the dictionary management information
received from the client from the plurality of kinds of recognition
dictionaries; and
a recognition step of recognizing the speech data using at least the
recognition dictionary determined in the determination step.

10. The method according to claim 9, wherein the recognition step
includes a step of recognizing the speech data using the recognition
dictionary determined in the determination step, and the user
dictionary received from the client.

11. The method according to claim 9 or 10, wherein the speech input
step comprises a display step of displaying an input form as a target
speech input, and

the dictionary management information is an input form identifier
that indicates a type of input form.

12. The method according to any of claims 9 to 11, wherein the
dictionary management information contains information indicating if
the user dictionary is used in recognition of the speech data.

13. The method according to any of claims 9 to 12, wherein the user
dictionary is formed by storing pronunciations and notations of the
target recognition words in correspondence with each other.

14. The method according to claim 11, wherein the user dictionary is
formed by also storing at least one input form identifier and the
target recognition words in correspondence with each other.

15. The method according to any of claims 9 to 14, wherein the user
dictionary is formed by also storing at least one of recognition
dictionary identifiers indicating recognition fields of the plurality
of kinds of recognition dictionaries, and the target recognition
words.

16. The method according to any of claims 9 to 15, wherein the speech
data is data obtained by encoding that speech data.

17. A computer readable memory that stores a program code of control
of a client-server speech recognition system for recognizing speech
input at a client by a server, comprising:

a program code of a speech input step of inputting speech;
a program code of a user dictionary holding step of holding, in the
client, a user dictionary formed by registering target recognition
words designated by a user; and
a program code of a transmission step of transmitting speech data
input in the speech input step, dictionary management information
used to determine a recognition field of a recognition dictionary
used to recognize the speech data, and the user dictionary to the
server;
a program code of a recognition dictionary holding step of holding,
in the server, a plurality of kinds of recognition dictionaries
prepared for respective recognition fields;
a program code of a determination step of determining one or more
recognition dictionaries corresponding to the dictionary management
information received from the client from the plurality of kinds of
recognition dictionaries; and
a program code of a recognition step of recognizing the speech data
using at least the recognition dictionary determined in the
determination step.

18. A speech recognition server for recognizing speech 
input at a client, and sending a recognition result to 
the client, comprising: 

reception means for receiving, from the client, 
speech data, dictionary management informa- 
tion used to determine a recognition field of a 
recognition dictionary used to recognize the 
speech data, and a user dictionary formed by 
registering target recognition words designated 
by a user; 

recognition dictionary holding means for hold- 
ing a plurality of kinds of recognition dictionar- 
ies prepared for respective recognition fields; 
determination means for determining one or
more recognition dictionaries corresponding to
the dictionary management information re- 
ceived from the client from the plurality of kinds 
of recognition dictionaries; and 
recognition means for recognizing the speech 
data using at least the recognition dictionary 
determined by said determination means. 

19. The server according to claim 18, wherein said
recognition means recognizes the speech data using
the recognition dictionary determined by said deter- 
mination means, and the user dictionary received 
from the client. 

20. The server according to claim 18 or 19, wherein the
speech data is data obtained by encoding that 
speech data. 

21. A speech recognition client for sending input 
speech to be recognized to a server, and receiving 
a recognition result of that speech, comprising: 

speech input means for inputting speech;
user dictionary holding means for holding a user dictionary formed by
registering target recognition words designated by a user; and
transmission means for transmitting speech data input by said speech
input means, dictionary management information used to determine a
recognition field of a recognition dictionary used to recognize the
speech data, and the user dictionary to the server.

22. The client according to claim 21, wherein said speech input means
comprises display means for displaying an input form as a target
speech input, and

the dictionary management information is an input form identifier
that indicates a type of input form.

23. The client according to claim 21 or 22, wherein the dictionary
management information contains information indicating if the user
dictionary is used in recognition of the speech data.

24. The client according to any of claims 21 to 23, wherein the user
dictionary is formed by storing pronunciations and notations of the
target recognition words in correspondence with each other.

25. The client according to claim 22, wherein the user dictionary is
formed by also storing at least one input form identifier and the
target recognition words in correspondence with each other.

26. The client according to any of claims 21 to 25, wherein the user
dictionary is formed by also storing at least one of recognition
dictionary identifiers indicating recognition fields of the plurality
of kinds of recognition dictionaries, and the target recognition
words.

27. The client according to any of claims 21 to 25, wherein the
speech data is data obtained by encoding that speech data.

28. A method of controlling a speech recognition server for
recognizing speech input at a client, and sending a recognition
result to the client, comprising:

a reception step of receiving, from the client, speech data,
dictionary management information used to determine a recognition
field of a recognition dictionary used to recognize the speech data,
and a user dictionary formed by registering target recognition words
designated by a user;
a recognition dictionary holding step of holding a plurality of kinds
of recognition dictionaries prepared for respective recognition
fields;
a determination step of determining one or more recognition
dictionaries corresponding to the dictionary management information
received from the client from the plurality of kinds of recognition
dictionaries; and
a recognition step of recognizing the speech data using at least the
recognition dictionary determined in the determination step.

29. The method according to claim 28, wherein the rec- 
ognition step includes a step of recognizing the 
speech data using the recognition dictionary deter- 
mined in the determination step, and the user dic- 
tionary received from the client. 

30. The method according to claim 28 or 29, wherein 
the speech data is data obtained by encoding that 
speech data. 
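
For illustration only, the server-side steps of claims 28 and 29 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Request`, `IDENTIFIER_TABLE`, `RECOGNITION_DICTIONARIES`, `determine_dictionaries` and `recognize` are hypothetical, the dictionary contents are made-up sample data, and the acoustic matching is left to a caller-supplied `decode` function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class Request:
    """What the reception step receives from the client (hypothetical layout)."""
    speech_data: bytes
    dictionary_management_info: str  # e.g. an input form identifier
    user_dictionary: Dict[str, str] = field(default_factory=dict)  # pronunciation -> notation


# Recognition dictionary holding step: one dictionary per recognition field.
# The entries are illustrative sample data only.
RECOGNITION_DICTIONARIES: Dict[str, Dict[str, str]] = {
    "name": {"suzuki": "Suzuki"},
    "address": {"tokyo": "Tokyo"},
}

# Identifier table: dictionary management information -> recognition fields.
IDENTIFIER_TABLE: Dict[str, List[str]] = {
    "form_name": ["name"],
    "form_address": ["address"],
}


def determine_dictionaries(info: str) -> List[Dict[str, str]]:
    """Determination step: pick one or more dictionaries for the received info."""
    return [RECOGNITION_DICTIONARIES[f] for f in IDENTIFIER_TABLE.get(info, [])]


def recognize(request: Request,
              decode: Callable[[bytes, Iterable[str]], Optional[str]]) -> Optional[str]:
    """Recognition step: use the determined dictionaries plus the user dictionary."""
    vocabulary: Dict[str, str] = {}
    for dictionary in determine_dictionaries(request.dictionary_management_info):
        vocabulary.update(dictionary)
    vocabulary.update(request.user_dictionary)  # claim 29: user dictionary is also used
    pronunciation = decode(request.speech_data, vocabulary.keys())
    return vocabulary.get(pronunciation)  # notation sent back as the recognition result
```

Here `decode` stands in for the speech recognition module itself; restricting it to the merged vocabulary mirrors the idea that only the dictionaries selected for the current recognition field, together with the user's own words, take part in recognition.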

31. A method of controlling a speech recognition client
for sending input speech to be recognized to a serv- 
er, and receiving a recognition result of that speech, 
comprising: 

a speech input step of inputting speech; 
a user dictionary holding step of holding a user 
dictionary formed by registering target recogni- 
tion words designated by a user; and 
a transmission step of transmitting speech data 
input in the speech input step, dictionary man- 
agement information used to determine a rec- 
ognition field of a recognition dictionary used to 
recognize the speech data, and the user dic- 
tionary to the server. 

32. The method according to claim 31, wherein the 
speech input step comprises a display step of dis- 
playing an input form as a target speech input, and 

the dictionary management information is an 
input form identifier that indicates a type of input 
form. 

33. The method according to claim 31 or 32, wherein 
the dictionary management information contains in- 
formation indicating if the user dictionary is used in 
recognition of the speech data. 

34. The method according to any of claims 31 to 33, 
wherein the user dictionary is formed by storing pro- 
nunciations and notations of the target recognition 
words in correspondence with each other. 

35. The method according to claim 32, wherein the user 
dictionary is formed by also storing at least one in- 
put form identifier and the target recognition words 
in correspondence with each other. 

36. The method according to any of claims 31 to 35, 
wherein the user dictionary is formed by also storing 
at least one of recognition dictionary identifiers in- 
dicating recognition fields of the plurality of kinds of 
recognition dictionaries, and the target recognition 
words. 



37. The method according to any of claims 31 to 36, 
wherein the speech data is data obtained by encod- 
ing that speech data. 
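
As a rough illustration of the transmission step of claims 31 to 37, the data sent to the server could be packaged as shown below. This is a sketch under assumptions, not the format used by the invention: the JSON layout, the field names, and the `UserWord` and `build_request` names are all hypothetical, and base64 is used only to carry bytes inside JSON (the `speech` argument is assumed to hold speech data that has already been encoded in the sense of claim 37).

```python
import base64
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class UserWord:
    """One user dictionary entry (hypothetical layout)."""
    pronunciation: str                    # claim 34: pronunciation ...
    notation: str                         # ... stored with its notation
    input_form_id: Optional[str] = None   # claim 35: optional input form identifier
    dictionary_ids: List[str] = field(default_factory=list)  # claim 36: dictionary identifiers


def build_request(speech: bytes,
                  input_form_id: str,
                  user_dictionary: List[UserWord],
                  use_user_dictionary: bool = True) -> str:
    """Bundle speech data, dictionary management information and the user dictionary."""
    return json.dumps({
        "speech_data": base64.b64encode(speech).decode("ascii"),
        "dictionary_management_info": {
            "input_form_id": input_form_id,              # claim 32
            "use_user_dictionary": use_user_dictionary,  # claim 33
        },
        "user_dictionary": [asdict(word) for word in user_dictionary],
    })
```

Sending all three items in one message is a design choice of this sketch; the claims only require that the speech data, the dictionary management information and the user dictionary reach the server.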

38. A computer readable memory that stores a program
code of control of a speech recognition server for
recognizing speech input at a client, and sending a
recognition result to the client, comprising:

a program code of a reception step of receiving,
from the client, speech data, dictionary management
information used to determine a recognition
field of a recognition dictionary used to
recognize the speech data, and a user dictionary
formed by registering target recognition
words designated by a user;
a program code of a recognition dictionary holding
step of holding a plurality of kinds of recognition
dictionaries prepared for respective recognition
fields;
a program code of a determination step of determining
one or more recognition dictionaries
corresponding to the dictionary management
information received from the client from the
plurality of kinds of recognition dictionaries; and
a program code of a recognition step of recognizing
the speech data using at least the recognition
dictionary determined in the determination
step.


39. A computer readable memory that stores a program
code of control of a speech recognition client for
sending input speech to be recognized to a server,
and receiving a recognition result of that speech,
comprising:

a program code of a speech input step of inputting
speech;
a program code of a user dictionary holding
step of holding a user dictionary formed by registering
target recognition words designated by
a user; and
a program code of a transmission step of transmitting
speech data input in the speech input
step, dictionary management information used
to determine a recognition field of a recognition
dictionary used to recognize the speech data,
and the user dictionary to the server.

40. A client-server speech recognition system for recognizing
speech input at a client by a server,

the client comprising:

a speech input unit that inputs speech;
a user dictionary holding a user dictionary
formed by registering target recognition
words designated by a user; and
a transmitter that transmits speech data input
by said speech input unit, dictionary
management information used to determine
a recognition field of a recognition dictionary
used to recognize the speech data,
and the user dictionary to the server, and

the server comprising:

a recognition dictionary holding a plurality
of kinds of recognition dictionaries prepared
for respective recognition fields;
a determination unit that determines one or
more recognition dictionaries corresponding
to the dictionary management information
received from the client from the plurality
of kinds of recognition dictionaries; and
a recognition unit that recognizes the speech
data using at least the recognition dictionary
determined by said determination unit.

41. A speech recognition server for recognizing speech
input at a client, and sending a recognition result to
the client, comprising:

a receiver that receives, from the client, speech data,
dictionary management information used to
determine a recognition field of a recognition
dictionary used to recognize the speech data,
and a user dictionary formed by registering target
recognition words designated by a user;
a recognition dictionary holding a plurality of
kinds of recognition dictionaries prepared for
respective recognition fields;
a determination unit that determines one or more
recognition dictionaries corresponding to the dictionary
management information received from
the client from the plurality of kinds of recognition
dictionaries; and
a recognition unit that recognizes the speech data
using at least the recognition dictionary determined
by said determination unit.

42. A speech recognition client for sending input
speech to be recognized to a server, and receiving
a recognition result of that speech, comprising:

a speech input unit that inputs speech;
a user dictionary holding a user dictionary
formed by registering target recognition words
designated by a user; and
a transmitter that transmits speech data input by
said speech input unit, dictionary management
information used to determine a recognition
field of a recognition dictionary used to recognize
the speech data, and the user dictionary
to the server.

43. Processor implementable instructions product for
causing a programmable computer device to carry
out the method of any of claims 28 to 37, when the
instructions product is run on said programmable
computer device.
FIG. 6